Deep learning is bringing breakthroughs to many computer vision subfields, including Optical Music Recognition (OMR), which has seen a series of improvements to musical symbol detection achieved by using generic deep learning models. However, so far, each such proposal has been based on a specific dataset and different evaluation criteria, which has made it difficult to quantify the new deep learning-based state of the art and to assess the relative merits of these detection models on music scores. In this paper, a baseline for general detection of musical symbols with deep learning is presented. We consider three datasets of heterogeneous typology but with the same annotation format and three neural models of different nature, and establish their performance in terms of a common evaluation standard. The experimental results confirm that direct music object detection with deep learning is indeed promising, but at the same time illustrate some of the domain-specific shortcomings of the general detectors. A qualitative comparison then suggests avenues for OMR improvement, based both on properties of the detection models and on how the datasets are defined. To the best of our knowledge, this is the first time that competing music object detection systems from the machine learning paradigm are directly compared to each other. We hope that this work will serve as a reference for measuring the progress of future developments of OMR in music object detection.